Modern Pathology

Elsevier BV

Preprints posted in the last 90 days, ranked by how well they match Modern Pathology's content profile, based on 21 papers previously published here. The average preprint has a 0.02% match score for this journal, so anything above that is already an above-average fit.

1
Unsupervised Tissue Concepts for Explainable Sarcoma Subtype Prediction from H&E

Bisson, T.; Ingram, D.; Singh, S.; Li, A.; Flynn, S.; Wang, W.-L.; Kim, A. E.; Bridge, C. P.; Demicco, E. G.; Sorrentino, A.; Jiang, S.; Hung, Y. P.; Lazar, A. J.; Iafrate, A. J.

2026-05-20 pathology 10.64898/2026.05.15.26353333 medRxiv
Top 0.1%
58.8%

Soft tissue sarcomas are a rare, heterogeneous group of tumors whose diagnosis remains challenging because of overlapping morphology and limited access to sarcoma-specialized pathologists. Although pathology foundation models have shown promise in computational pathology, their clinical translation remains limited by insufficient interpretability, particularly in diagnostically complex settings such as sarcoma diagnosis. Here, we developed and evaluated an H&E-based AI framework for sarcoma subtype classification that focused on explainability. Using the CONCH v1.5 foundation model, we computed embeddings from a tissue microarray cohort of 2,545 cases spanning 19 sarcoma subtypes and trained an attention-based multiple-instance learning model that achieved a balanced accuracy of 77.38% (SD 1.88). To move explainability beyond attention-based localization, we trained a sparse autoencoder on patch-level embeddings to learn 768 recurring visual concepts. Ninety high-activation concepts were reviewed by three senior pathologists and curated into morphologically meaningful and non-meaningful categories, yielding a semantic dictionary of 41 diagnostically relevant tissue concepts. We then trained a linear attention-based model on the 768-concept vectors, which retained much of the performance of the raw embedding-based ABMIL model, achieving a balanced accuracy of 73.74% (SD 1.30). When restricting the linear model to pathologist-curated morphologic concepts only, balanced accuracy further decreased to 67.04% (SD 1.27), suggesting that the residual performance gain in the full concept model was driven by inconsistent, technical, or diagnostically irrelevant concepts. Concept-level explanations of the curated linear attention-based model aligned with known sarcoma morphology, including lipogenic, myxoid, spindle-cell, pleomorphic, vascular, small round blue cell, and matrix-forming patterns, and reproduced patterns of diagnostic overlap observed in human sarcoma pathology.
Together, these results show that H&E-based foundation-model representations capture meaningful diagnostic structure within the known limitations of H&E in sarcoma diagnostics, but that their clinical value depends on whether this structure can be made interpretable to pathologists. Sparse autoencoder-derived concepts can address this critical gap by converting embedding-level signal into recurring morphologic patterns that pathologists can review and name, providing the foundation to link these patterns to subtype predictions. In doing so, this approach turns concept discovery into a practical form of diagnostic explanation, while also revealing where model performance is supported by recognizable histopathology and where it relies on diagnostically irrelevant or inconsistent visual patterns.
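
The headline metric in this abstract, balanced accuracy, is the mean of per-class recall, so rare subtypes count as much as common ones. A minimal sketch in plain Python follows; the function name and the toy labels are illustrative assumptions, not data or code from the preprint.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels: four cases of one subtype, two of another.
y_true = ["lipogenic", "lipogenic", "lipogenic", "lipogenic", "myxoid", "myxoid"]
y_pred = ["lipogenic", "lipogenic", "lipogenic", "myxoid", "myxoid", "lipogenic"]
score = balanced_accuracy(y_true, y_pred)  # recall 3/4 and 1/2, averaged
```

On imbalanced cohorts this value can differ noticeably from plain accuracy, which is presumably why the abstract reports it for a 19-subtype problem.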

2
Interpretable morphology mapping of peripheral blood leukocytes using annotation-efficient artificial intelligence

Liu, Z.; Castillo, S. P.; Han, X.; Sun, X.; Hu, Z.; Yuan, Y.

2026-05-26 pathology 10.64898/2026.05.22.725537 medRxiv
Top 0.1%
52.6%

Background Peripheral blood smear (PBS) review is labor-intensive, subjective, and challenging for rare or morphologically heterogeneous cell types in hematologic malignancies. Artificial intelligence (AI) offers a scalable alternative, but broader clinical translation is constrained by annotation burden and limited interpretability. Methods We developed an interpretable, annotation-efficient AI framework that learns leukocyte morphology through a two-stage process: label-free representation learning to construct a morphological embedding space, followed by supervised fine-tuning for cell type and morphological attribute classification. The model was trained and evaluated on 5,952 PBS images from cancer patients at MD Anderson Cancer Center, including blast cells, and 17,092 images from public sources. Active learning strategies were assessed to improve label efficiency, and interpretability was examined using saliency and embedding visualization. An interactive web application, HemoSight, was developed to support clinical review. Findings The framework achieved a macro-F1 score of 0·96 for 9-way leukocyte classification on the internal test split and 0·83 on the held-out patient cohort. Active learning substantially reduced annotation requirements, reaching peak performance with only 13·3% of available labels and significantly improving learning efficiency across 8 of 9 cell types. The model generalized to classifying 11 leukocyte morphological attributes with a mean F1 score of 85·8% and revealed structured morphological landscapes. Saliency maps, embedding visualizations, and the HemoSight application enabled transparent morphological inspection of model predictions, supporting confidence in model behavior and feasibility for clinical integration.
Interpretation Our framework enables scalable, annotation-efficient, and interpretable modeling of leukocyte morphology, supporting the integration of AI-assisted PBS review for hematopathology workflows. Funding Seed funding from The University of Texas MD Anderson Cancer Center. Research in Context Evidence before this study Peripheral blood smear review is essential for diagnosing and monitoring hematologic malignancies, but manual case review is time-consuming and variable, particularly for rare or abnormal leukocyte types. Automated hematology analyzers are widely used to flag abnormal cells; however, they provide limited morphological insight and often require frequent manual correction, especially in cancer settings where disease and treatment alter cell appearance. Previous artificial intelligence approaches for leukocyte classification have shown promise, but most rely on fully supervised learning, require extensive expert annotation, focus on a limited set of cell types, and frequently exclude diagnostically important rare cells such as blasts. Interpretability is inconsistently addressed, and few studies provide tools that allow clinicians to inspect and interpret model outputs within routine workflows. Added value of this study This study introduces an annotation-efficient framework trained on a large collection of peripheral blood smear images, including cancer patient samples with hematopathologist-verified rare cell types such as blasts. The framework learns leukocyte morphology from unlabeled images and adapts to multiple classification tasks with minimal expert labeling. Performance is evaluated on both internal test splits and a held-out patient cohort to provide a realistic estimate of generalization. Iterative, uncertainty-guided annotation substantially reduces labeling requirements while improving learning efficiency across most leukocyte classes.
Beyond cell-type classification, the framework is extended to 11 clinically relevant morphological attributes and reveals a structured morphological landscape. These capabilities are integrated into a web application, HemoSight, enabling real-time inference and transparent morphological inspection of predictions within hematopathology workflows. Implications of all the available evidence Advancing artificial intelligence for hematology requires methods that reduce expert labeling demands, provide interpretable outputs, and perform reliably across clinically diverse patient samples. This study shows that learning from largely unlabeled data combined with iterative expert annotation can support scalable and flexible modeling of leukocyte morphology for classification tasks. Integrating quantitative predictions and interactive visualization supports the use of artificial intelligence as an assistive tool for diagnostic peripheral blood smear review, with potential to improve efficiency, consistency, and reviewer confidence.

3
Platelets Outperform Leukocytes in Transcriptomic Liquid Biopsy Profiling of Myeloproliferative Neoplasms

Shen, Z.; Sawalkar, A.; Wu, J.; Natu, V.; Rowley, J.; T. Rondina, M.; Krishnan, A.

2026-04-01 pathology 10.64898/2026.03.30.714941 medRxiv
Top 0.1%
34.5%

Myeloproliferative neoplasms (MPNs) are characterized by progressive myelofibrosis that drives morbidity and mortality. Liquid biopsy approaches to noninvasively monitor fibrotic progression remain limited. We performed comparative transcriptomic profiling of CD45-depleted platelet-enriched and CD45+ leukocyte-enriched fractions from matched peripheral blood samples of 76 individuals (27 primary myelofibrosis, 17 polycythemia vera, 14 essential thrombocythemia, 18 healthy controls). Platelet RNA sequencing was performed in 2018-2020 on Illumina HiSeq 4000, while WBC RNA sequencing was conducted in 2023 on Illumina NovaSeq 6000 from cryopreserved CD45+ enriched fractions of specimens obtained at the identical time and from the same blood sample as the platelet RNA. Despite comparable library preparation protocols and higher sequencing depth in WBC samples, platelet transcriptomes exhibited 5.1-fold more differential expression in myelofibrosis (3,453 versus 681 genes, adjusted p<0.05, |log2FC|>1). Platelet signatures were enriched for proteostasis pathways including endoplasmic reticulum stress and unfolded protein response, reflecting megakaryocyte dysfunction in the fibrotic bone marrow niche. WBC signatures predominantly featured immune activation and proliferative pathways, indicating systemic inflammatory responses. Multinomial LASSO classification demonstrated superior performance of platelet-based models for myelofibrosis diagnosis (AUROC 0.85) compared to WBC-based (AUROC 0.77) or clinical models (AUROC 0.59). Combined platelet+WBC models did not improve performance (AUROC 0.80), indicating complementary but non-additive information. These findings establish platelet transcriptomic profiling as a superior noninvasive biomarker platform for monitoring myelofibrosis in MPNs, capturing megakaryocyte-driven fibrogenesis with greater sensitivity than peripheral leukocyte-based approaches. 
Highlights Using matched WBC and platelet RNA-seq from MPN patients, we identify myelofibrosis-associated transcriptomic signatures specifically enriched in platelets. Multinomial LASSO modeling highlights platelet-derived gene expression as a dominant and predictive biomarker of myelofibrosis, outperforming clinical parameters and WBC signatures.
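
The differential-expression criterion and the fold difference quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, and only the thresholds (adjusted p<0.05, |log2FC|>1) and the two gene counts come from the abstract.

```python
def is_differentially_expressed(adj_p, log2_fc, alpha=0.05, min_abs_lfc=1.0):
    """Apply the abstract's stated cut-offs: adjusted p < 0.05 and |log2FC| > 1."""
    return adj_p < alpha and abs(log2_fc) > min_abs_lfc


# Gene counts reported in the abstract for the myelofibrosis comparison.
platelet_de_genes = 3453
wbc_de_genes = 681

# Ratio of differentially expressed genes, platelets versus WBC.
fold_difference = platelet_de_genes / wbc_de_genes
print(f"{fold_difference:.1f}-fold")
```

The ratio is about 5.07, which rounds to the 5.1-fold figure given in the abstract.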

4
Cytoplasmic staining of T cell receptor components enables efficient assessment of lineage and clonality in surface CD3-negative T cell neoplasms

Wilk, A. J.; Gitana, G.; Oak, J.

2026-06-04 pathology 10.64898/2026.06.02.26354783 medRxiv
Top 0.1%
28.0%

Flow cytometry can establish T cell clonality by detecting a restricted expression pattern of the T cell receptor (TCR) β constant region (TRBC), expressed in association with CD3. However, T cell neoplasms frequently lose surface expression of the CD3/TCR complex, posing a challenge to demonstrating T cell lineage and clonality. To address this challenge, here we present a 12-color flow cytometry panel, called cytoTCR, to characterize cytoplasmic expression of CD3/TCR complex components. We apply cytoTCR to 38 patient specimens with immunophenotypically abnormal T cell populations, demonstrating this approach can efficiently establish T cell lineage and clonality in challenging T cell neoplasms that have lost surface CD3 expression. While we show that natural killer (NK)-lineage neoplasms can express cytoplasmic CD3 at similar levels to T cells, we show that absent expression of cytoplasmic TCR components by mature lymphocytes can help confirm NK cell lineage. We demonstrate that cytoTCR can detect cytoplasmic TRBC-restriction in challenging cases of null-phenotype anaplastic large cell lymphoma, which lack surface expression of pan-T cell antigens. In cases of T-lymphoblastic leukemia, cytoTCR shows that cytoplasmic TRBC expression matches the expected developmental stage of the leukemia. Finally, we use cytoTCR to characterize atypical cCD3-CD7- T cells in a patient with a history of T-lymphoblastic leukemia as well as recent CAR-T therapy, showing that this atypical population is polytypic and represents CAR-T product rather than residual disease. Our study presents a broadly applicable flow cytometric approach to simultaneously assess T cell lineage and clonality in suspected T lineage populations with absent surface CD3 expression.

5
DIANNE: Segmentation-Free Localization of Histology Differential Attributes

Domanskyi, S.; Rubinstein, J. C.; Sheridan, T. B.; Thiesen, A.; Noorbakhsh, J.; Alcoforado Diniz, J.; Ramasamy, R.; Baker, D. S.; Sheldon, R.; Wu, Q.; Kuchel, G.; Robson, P.; Chuang, J. H.

2026-05-01 pathology 10.64898/2026.04.28.721103 medRxiv
Top 0.1%
26.5%

Pathologist-guided distinctions within histology and spatial omic images provide insights into health and disease, with digital pathology leveraging artificial intelligence to automate such assessments. To train computational models, current digital pathology methods rely on upfront manual annotations, which are time-consuming to generate. Pre-annotation is poorly suited to investigating novel spatial behaviors--a major need driven by advances in spatial profiling--for which annotation criteria and data needs will be uncertain. To address these challenges, we present DIANNE, a digital pathology approach for rapid training and inference of spatial differential attributes based on train-time Positive Class Mixup Augmentation. DIANNE can compute foundation model-derived segmentation-free localization of differential classifiers across whole slide H&E images within seconds on a workstation, enabling interactive investigation of spatial niches. Predictive models can be re-trained in real-time in response to patch or regional annotation changes, clarifying determinative biological attributes across slides from only a few dozen annotated patches. We demonstrate the effectiveness of DIANNE for tumor detection, artifact identification, and exploration of pancreatic, fetal membranes and kidney tissue structures. DIANNE also provides analogous capabilities for IHC, multiplex immunofluorescence, and registered spatial transcriptomic+H&E images. DIANNE is implemented in a Jupyter toolkit, enabling rapid development of high-resolution classifiers from weakly-supervised training. DIANNE provides a practical system to quantitatively understand known and novel spatial phenotypes.

6
Closing the Paediatric Gap: Adult-Trained AI Generalises Robustly to Paediatric Coeliac Disease Diagnosis

Jaeckle, F.; Gillett, P. M.; Kirkwood, K. J.; Natu, S.; Chan, J. Y. H.; Bateman, A. C.; Arends, M. J.; Soilleux, E. J.

2026-06-05 pathology 10.64898/2026.06.04.26354889 medRxiv
Top 0.1%
23.9%

Background Coeliac disease (CD) diagnosis on duodenal biopsies is limited by interobserver variability. We have previously demonstrated pathologist-level performance with our artificial intelligence (AI) model for the histopathological diagnosis of adult CD, but not in paediatric practice. As paediatric CD screening programmes expand internationally, accurate and scalable diagnostic tools are needed. We investigated whether an AI model trained exclusively on adult whole-slide images (WSIs) can generalise to paediatric CD diagnosis across independent centres. Methods A training and validation dataset of 9,958 WSIs from 8,421 adult patients (961 CD) from five centres was used to develop an ensemble of multiple-instance learning models using features from a foundation model. Testing was performed on 708 consecutive paediatric patients (86 CD) from two centres (Edinburgh and Southampton) not included in training. Model calibration was assessed, and probability outputs were grouped into clinically interpretable categories. Findings In adult cross-validation, the AI model achieved an area under the receiver operating characteristic curve (AUC) of 98.7%, sensitivity of 84.9%, specificity of 99.0%, and negative predictive value (NPV) of 98.1%. On testing (paediatric) datasets, performance remained high (AUC 98.8%, sensitivity 80.2%, specificity 98.4%, NPV 97.3%). Restricting analysis to predictions outside the intermediate-probability range (predicted CD probability <10% or ≥65%; 85.3% of cases) improved sensitivity to 100% and specificity to 98.7%. No misclassifications were observed among high-confidence predictions (<2% or ≥85%; 66.0% of cases). The expected calibration error was 0.03. Performance improved significantly when biopsies from both duodenal sites (bulb [D1] and descending [D2/3]) were considered. Interpretation Our AI model, trained on adult biopsies, generalises to paediatric CD diagnosis across centres and scanner platforms.
Well-calibrated probability outputs provide clinically interpretable measures of diagnostic confidence and could support safe identification of CD-negative biopsies within defined thresholds. These findings demonstrate the feasibility of applying adult-derived AI models in paediatric populations and reinforce the importance of multi-site (D1 & D2) biopsy sampling.
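
The grouping of probability outputs into confidence categories described in this abstract can be sketched as a simple threshold function. The cut-offs (2%, 10%, 65%, 85%) are those quoted in the abstract; the function name and the band labels are hypothetical and chosen only for illustration, since the abstract does not name the categories.

```python
def confidence_band(p_cd):
    """Map a predicted coeliac disease probability to an illustrative band.

    Thresholds follow the abstract: <2% or >=85% is high confidence,
    <10% or >=65% lies outside the intermediate range.
    Band labels are assumptions, not the authors' terminology.
    """
    if not 0.0 <= p_cd <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if p_cd < 0.02:
        return "high-confidence negative"
    if p_cd < 0.10:
        return "likely negative"
    if p_cd < 0.65:
        return "intermediate"
    if p_cd < 0.85:
        return "likely positive"
    return "high-confidence positive"


for p in (0.01, 0.05, 0.40, 0.70, 0.95):
    print(f"{p:.2f} -> {confidence_band(p)}")
```

Under this reading, only cases in the "intermediate" band would need full manual review, which matches the abstract's framing of thresholds for safe identification of CD-negative biopsies.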

7
Interpretable machine learning for coeliac disease diagnosis: quantitative morphometry of duodenal biopsies

Bryant, R.; Romero Diaz, J.; Scott, A. G.; Sagdeo, A. A.; Jenkins, G. Z.; Richardson, R. A.; Chan, J. Y. C.; Arends, M. J.; Soilleux, E. J.; Jaeckle, F.

2026-06-03 pathology 10.64898/2026.06.02.26354731 medRxiv
Top 0.1%
22.9%

Background Coeliac disease affects approximately 1% of the global population and remains substantially underdiagnosed. Histopathological assessment of duodenal biopsies is the diagnostic gold standard but is subject to approximately 20% inter-observer disagreement. While machine learning approaches show promise, most prior work relies on black-box models with limited interpretability, restricting clinical adoption. Methods We present an interpretable pipeline that follows established histopathological criteria by extracting clinically meaningful morphological features from H&E-stained whole-slide images. Five sequential stages perform pre-processing, semantic segmentation of villi, crypts, intraepithelial lymphocytes (IELs) and enterocytes, crypt morphometry, villus length estimation via a novel polyline-based keypoint model, and coeliac disease classification using three quantitative features: IEL-to-enterocyte ratio, villus-to-crypt area ratio, and villus-length-to-crypt-depth ratio. Training and validation used data from four institutions; independent testing used 1,357 WSIs from two further institutions including one with a previously unseen scanner manufacturer, spanning five diagnostic categories: coeliac disease, normal mucosa, chronic inflammation, gastric metaplasia, and gastric heterotopia. Results Semantic segmentation achieved villus and crypt precision and recall of 87-90%. Villus length estimation correlated strongly with expert annotations (Pearson's r=0.85, mean relative error 13.5% post-calibration). All three morphological features significantly separated coeliac disease from all non-coeliac diagnostic groups across internal and external datasets (p<0.01 in all comparisons). On the test set the diagnostic classifier achieved accuracy 94.5%, PPV 92.9%, NPV 94.7%, and AUC 0.982. 
Conclusions This interpretable framework achieves strong multi-centre diagnostic performance while producing quantitative morphological outputs, villus length, crypt depth, and IEL-to-enterocyte ratios, that directly reflect established histopathological criteria, representing a meaningful step towards standardised AI-assisted coeliac disease diagnosis.
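
The three quantitative features named in this abstract are plain ratios of measured quantities. A minimal sketch of how they might be derived from per-biopsy measurements is given below; the class, field names, and example values are hypothetical, and no classification thresholds are shown because the abstract does not state any.

```python
from dataclasses import dataclass


@dataclass
class BiopsyMeasurements:
    """Hypothetical per-biopsy measurements, e.g. from a segmentation stage."""
    iel_count: int
    enterocyte_count: int
    villus_area: float
    crypt_area: float
    villus_length: float
    crypt_depth: float


def morphometric_features(m):
    """Compute the three ratios the abstract lists as classifier inputs."""
    return {
        "iel_to_enterocyte": m.iel_count / m.enterocyte_count,
        "villus_to_crypt_area": m.villus_area / m.crypt_area,
        "villus_length_to_crypt_depth": m.villus_length / m.crypt_depth,
    }


# Made-up example values, in arbitrary but consistent units.
example = BiopsyMeasurements(
    iel_count=30, enterocyte_count=100,
    villus_area=2.0, crypt_area=1.0,
    villus_length=450.0, crypt_depth=150.0,
)
features = morphometric_features(example)
```

Keeping the features as named ratios is what makes the output directly readable against established histopathological criteria.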

8
SortIT - A Tool For Assessing Observer Variability And Creating Ground Truth Image Classification Datasets

Uegami, W.; Bisson, T.; Okoshi, E. N.; Costa da Silva, F. G.; Jiragawasan, C.; Zerbe, N.; Bychkov, A.; Fukuoka, J.

2026-05-29 pathology 10.64898/2026.05.28.728616 medRxiv
Top 0.1%
22.9%

Interobserver variability in pathological assessments is a well-recognized challenge that impacts diagnostic reliability and disease understanding. This variability exists across many subspecialties due to the subjective nature of evaluations. Artificial intelligence (AI) applied to whole slide images has potential to standardize procedures and reduce variability in pathology, but transitioning to these technologies does not guarantee improvement. Establishing reliable ground truth datasets with consensus annotations is crucial for developing robust AI solutions. We introduce SortIT, an open-source web application that facilitates systematic creation and evaluation of ground truth image tile annotations. SortIT enables multiple annotators to independently label tiles, with flexible user permission controls. Annotated data can be exported for statistical analysis of observer variation and for creating ground truth datasets from consensus tiles. We outline protocols using SortIT for several use cases: (1) mitosis segmentation in tumor regions, (2) evaluating AI solutions for prostate cancer grading by comparing to expert consensus, and (3) granuloma classification by annotating discriminative tile-level features. A key strength of SortIT lies in its ease of deployment, making it accessible and usable for a wide range of users. Overall, SortIT provides a valuable tool to establish high-quality ground truth datasets and comprehensively assess observer variability. Critical evaluation of ground truth quality using systematic annotation methodologies is crucial for developing accurate and generalizable diagnostic AI tools. Its open-source nature facilitates community adoption and further development.

9
Whole slide image analysis of the endometrial decidual reaction reveals multiscale perturbations associated with miscarriage

Wright, G.; Rawlings, T. M.; Eastwood, M.; Brighton, P.; Taus Nebot, M.; Estermann, A.; Flett, W. T. M.; Younis, A.; Makwana, K.; Yoshihara, H.; Aplin, J. D.; Kong, C.-S.; Christian, M.; Lucas, E. S.; Muter, J.; Brosens, J. J.; Minhas, F.

2026-05-26 pathology 10.64898/2026.05.22.727262 medRxiv
Top 0.1%
22.3%

The inflammatory decidual reaction renders the cycling endometrium transiently permissive for embryo implantation before transforming it into the decidua, the maternal bed accommodating the fetal placenta during pregnancy. Disruptions in decidual tissue remodeling are linked to miscarriage and other pregnancy disorders. However, endometrial assessment is hampered by a lack of affordable technologies capable of mapping the spatiotemporal dysregulation of this dynamic and complex tissue. Employing a graph neural network on whole slide images of 493 CD56-immunostained endometrial samples, Endometronome was developed as a deep learning tool to spatially track the decidual reaction and provide accurate estimates of marker gene expression. When applied to 2,690 additional biopsies, this model consistently identified morphological correlates of prior miscarriage burden, a proxy for future risk. Further, a morphological signature indicative of metabolic glandular impairment discriminated between clinical miscarriage presentations. These findings illustrate how advanced imaging analysis of routine histology can transform miscarriage prevention strategies.

10
Assessing Foundation Models for Computational Pathology in Endometrial Cancer

Volinsky-Fremond, S.; van den Berg, N.; Barkey Wolf, J.; Schoenpflug, L. A.; Andani, S.; Ortoft, G.; Jobsen, J. J.; Lutgens, L. C.; Powell, M. E.; Mileshkin, L. R.; Mackay, H.; Leary, A.; Razack, R. R.; de Bruyn, M.; de Boer, S. M.; Nout, R. A.; Smit, V. T.; Creutzberg, C. L.; Koelzer, V. H.; Bosse, T.; Horeweg, N.

2026-05-25 pathology 10.64898/2026.05.22.26353897 medRxiv
Top 0.1%
22.2%

Computational pathology leverages deep learning to extract clinically relevant information from digitized tumor slides, predicting histopathological subtypes, molecular alterations, and patient outcomes. Recent pipelines increasingly rely on foundation models trained on large pan-cancer datasets to generate generalizable features. In endometrial cancer (EC), their comparative performance for clinical diagnostic tasks remains unexplored. For the first time, this study evaluates the performance of seven state-of-the-art foundation models across morphological, molecular, and prognostic tasks using a large EC dataset of 3,293 patients from randomized trials and clinical cohorts. In addition, their performance was compared to one model (EsVIT) exclusively trained on EC. The foundation models H-OPTIMUS-0, CONCH, and VIRCHOW2 achieved the highest mean performance, but the best-performing foundation model varied by task. The top-performing foundation model outperformed the EC-specific feature extractor EsVIT across all tasks. This study highlights the superiority of foundation models over a domain-specific feature extractor in EC. Selecting the optimal foundation model for novel tasks remains challenging due to performance plateaus and limited information on the training datasets, requiring rigorous benchmarking and domain insight to reach maximum potential.

11
Artificial Intelligence Devices for Image Analysis in Digital Pathology

Matthews, G. A.; Godson, L.; McGenity, C.; Bansal, D.; Treanor, D.

2026-03-26 pathology 10.64898/2026.03.23.26349089 medRxiv
Top 0.1%
19.2%

Background There is increasing momentum behind the clinical implementation of AI-based software for image analysis in digital pathology. As regulations, standards, and national approaches to the clinical use of AI continue to develop, the marketplace of AI products is expanding and evolving - presenting pathologists with a multitude of devices that offer the potential to improve pathology services. Methods To maintain pace with this changing AI device landscape, we conducted a comprehensive search for, and analysis of, commercial AI products for image analysis in digital pathology. This included CE-marked and Research Use Only (RUO) products using images with histological stains (e.g., H&E) or immunohistochemical (IHC) labelling. Product information and published clinical validation studies were assessed, to understand the quality of supporting evidence on available products, and product details were compiled into a public register: https://osf.io/gb84r/overview. Results In total, we identified and assessed 90 CE-marked and 227 RUO AI products. We found that AI products for cancer detection in prostate and breast pathology comprised a substantial portion of the marketplace for H&E image analysis, while IHC products were almost exclusively for use in breast cancer. Clinical validation studies on these products have steadily increased; however, we found that published studies were only available for just over half of H&E products and just over a quarter of IHC products. For CE-marked products, the dataset quality and diversity for AI model performance validation was highly variable, and particularly limited for IHC products. Furthermore, only a limited number of products included studies that assessed measures of clinical utility.
Conclusion As clinical deployment of AI products for image analysis in histopathology grows, there is a need for transparency, rigorous validation, and clear evidence supporting clinical utility and cost-effectiveness. Independent scrutiny of the expanding offering of AI products provides insight into the opportunities and shortcomings in this domain.

12
HistoSB-Net: Semantic Bridging for Data-Limited Cross-Modal Histopathological Diagnosis

Bai, B.; Shih, T.-C.; Miyata, K.

2026-03-26 pathology 10.64898/2026.03.23.713838 medRxiv
Top 0.1%
18.9%

Vision-language models (VLMs) provide a unified framework for multimodal reasoning, yet their representations are primarily learned from natural image-text corpora and often exhibit semantic misalignment when transferred to histopathology, particularly under data-limited diagnostic settings. To address this limitation, we propose HistoSB-Net, a semantic bridging network designed to adapt pre-trained VLMs to multimodal histopathological diagnosis while preserving their original semantic structure. HistoSB-Net introduces a constrained semantic bridging (CSB) module that operates within the self-attention projection space of both vision and text encoders. Instead of employing explicit cross-attention or full fine-tuning, CSB adaptively modulates pre-trained attention projections through a lightweight nonlinear semantic bottleneck, enabling structured cross-modal regulation with limited additional parameters. The framework supports both patch-level and whole-slide image (WSI)-level diagnosis within a unified architecture. Experiments on six pathology benchmarks, comprising two WSI-level and four patch-level datasets, demonstrate consistent improvements over zero-shot inference across 36 backbone-dataset combinations under limited supervision. Further analysis of prototype-based margin distributions and confusion matrices shows that these improvements are accompanied by enhanced intra-class compactness and increased inter-class separation in the embedding space. These results indicate that CSB provides an effective and computationally manageable strategy for adapting pre-trained VLMs to data-limited digital pathology tasks.

13
Spatial transcriptomic analysis reveals coordinated gene expression in ovarian clear cell carcinoma and adjacent endometriosis in UK and Japanese patients

Kuroda, T.; Giannone, G.; Ennis, D. P.; Mirza, H. B.; Marks, D.; Flood, L.; Sisley, M.; Griffin, R.; Desai, S.; McDermott, J.; Lambie, N.; Fukasawa, N.; Kiyokawa, T.; Shimoda, M.; Saito, M.; Koba, T.; Saito, R.; Kawabata, A.; Takenaka, M.; Valabrega, G.; Matthews, N.; Tookman, L. A.; Yanaihara, N.; Okamoto, A.; McNeish, I. A.

2026-06-02 pathology 10.64898/2026.05.29.728698 medRxiv
Top 0.1%
18.6%

PurposeOvarian clear cell carcinoma (OCCC) is strongly associated with endometriosis and shows geographic variation in incidence. We investigated whether OCCC and adjacent endometriosis exhibit distinct transcriptional states and whether these patterns differ between United Kingdom (UK) and Japanese cohorts. Experimental DesignWe performed whole-transcriptome spatial profiling on specimens from 16 OCCC cases (8 UK, 8 Japan) in which tumor and endometriosis were both present. Gene expression was analyzed in tumor, endometriosis and stroma. ARID1A status was assessed by immunohistochemistry. ResultsMedian age was 59 years (range 26-82). 13/16 cases (81.3%) had early-stage disease. Tissue compartment rather than cohort of origin was the dominant source of variation across endometriosis and tumor regions. Endometriosis was enriched for inflammatory and immune-related pathways compared to tumor, whilst there was greater representation of chromatin and protein-DNA complex assembly pathways in tumor regions. These patterns were conserved across both cohorts and after stratification by ARID1A status. Mesenchymal-associated gene expression scores also significantly differed across stroma, endometriosis and tumor with clear compartmental separation. Cell type deconvolution analyses showed clear compositional differences between stromal and epithelial disease compartments. ConclusionsOCCC and coexisting endometriosis are transcriptionally distinct, with the dominant contrast being compartmental rather than geographic. ARID1A alone is unlikely to account for the principal spatial transcriptional states identified here. Further analyses will be required to ascertain whether these differences reflect genuine biological differences between OCCC and coexisting endometriosis or represent different stages of endometriosis-associated tumorigenesis. 
Translational Relevance: Ovarian clear cell carcinoma often arises in association with endometriosis, yet the biological transition between these lesions remains poorly understood. Using spatial transcriptomics in matched tumor and adjacent endometriosis from Japanese and UK cohorts, we showed that endometriosis is characterized by inflammatory and antigen-presentation features, whereas tumor regions showed chromatin-organization and oncogenic transcriptional states. These patterns were largely maintained irrespective of ARID1A status and geographic background. In addition, spatial deconvolution suggested differences in local immune composition, with tumor regions showing relatively greater neutrophil- and T cell-associated signals. Together, our data suggest that OCCC and coexisting endometriosis share a spatially linked tissue context, but that tumor regions have a distinct transcriptional profile and microenvironment that may be involved in malignant transformation and may inform interpretation of molecular classification in endometriosis-associated OCCC.

14
Interpreting and Validating a Deep Learning Model Predictive of Spatial Morphologic-Molecular Patterns in Lung Adenocarcinoma, Using Ground Truth Immunohistochemistry

Rao, V. R.; Workman, A. A.; Palisoul, S. M.; Limoge, C. J.; Vaickus, L. J.; Zanazzi, G. J.; Lu, L.; Liu, X.; Sukhadia, S. S.

2026-04-23 pathology 10.64898/2026.04.20.719723 medRxiv
Top 0.1%
17.9%

Lung adenocarcinoma (LUAD), the most common subtype of non-small cell lung cancer, exhibits profound histological and molecular heterogeneity. While genomic profiling has identified key oncogenic drivers and immune signatures, its use is limited by cost, technical demands and tissue availability. In addition, spatial transcriptomics provides spatially resolved molecular insights but remains challenging and time-consuming. To address this gap, we developed XpressO-Lung, an explanatory deep learning model that predicts gene expression heterogeneity spatially in tumor and its microenvironment on hematoxylin and eosin based diagnostic (Dx) whole-slide images (WSIs) by learning associations between tissue morphology and the corresponding bulk-transcriptomic data. Utilizing 200 LUAD cases from The Cancer Genome Atlas, XpressO-Lung predicted spatial expression patterns of NAPSA, TP53I3, CD8A, TTF1, KRT7, CDKN2A, FOXO1, KEAP1, RB1 and TP53 on Dx-WSIs with AUCs ranging from 0.64 to 0.92. The predicted spatial gene expression patterns aligned with the known morphologic interactions of the tumor and its microenvironment, capturing biological events directly on Dx-WSIs. These spatio-morpho-molecular associations were further validated using immunohistochemistry on an external set of clinical samples at Dartmouth Health, demonstrating concordance between model-predicted spatial patterns and observed histomorphologic features. By coupling predictive performance with spatial interpretability of gene expression on Dx-WSIs, the XpressO-Lung model bridges histopathology and bulk-transcriptomics, enabling explainable spatio-morpho-genomic analyses to advance biomarker discovery, therapeutic stratification and precision oncology in LUAD.

15
Foundation model-based tool for automated ulcerative colitis histology scoring demonstrates non-inferiority to pathologists across multiple scoring indices

Tahir, W.; Shamshoian, J.; Tauber, J.; Clinton, L. K.; Griffin, M.; Shah, C.; Singh, G.; Fahy, D.; Sucipto, K.; Brosnan-Cashman, J.; Altepeter, T. A.; Bhattacharya, S.; Crandall, W.; Duan, C.; Gale, J. D.; Gupta, V.; Haarmann, H.; Harpaz, N.; Hooper, A. T.; Horowitz, J.; Hurtado-Lorenzo, A.; Hussaini, B. E.; Jairath, V.; Jones, A.; Kostiuk, B.; Kukreja, A.; Laroux, F. S.; Lissoos, T.; McBride, R. B.; Najdawi, F.; Nayyar, A.; Osterman, M. T.; Panchal, P.; Ruane, D.; Travis, S.; Visvanathan, S.; Wilson, L.; Jayson, C.

2026-06-11 pathology 10.64898/2026.06.09.26355212 medRxiv
Top 0.1%
14.8%

In clinical trials for ulcerative colitis (UC), pathologists assess disease severity through standardized histological indices, including the Geboes Score, Robarts Histopathology Index (RHI), and Nancy Histologic Index (NHI). Despite strong associations with clinical outcomes, histologic scoring suffers from inter- and intra-reader variability, and consensus criteria for histologic remission remain uncertain. Through a consortium approach, we developed an artificial intelligence-based measurement (AIM) tool for scoring histology in UC mucosal biopsies (AIM-HI UC). This model, trained on a large dataset of UC biopsies (N=10,230), utilizes additive multiple instance learning models leveraging PLUTO, a pathology foundation model, that predict each of the Geboes subgrades, from which the Geboes grade-level score, RHI, and NHI can be calculated. Evaluation of this model on a standalone verification set including clinical trial specimens established algorithm non-inferiority and/or superiority relative to standard qualified pathologists through comparison of algorithm-consensus and pathologist-consensus agreement metrics (non-inferior if difference >-0.1, superior if difference >0, inclusive of confidence intervals). AIM-HI UC was determined to be non-inferior to pathologists (N=3) for the prediction of all seven Geboes subgrades, grade-level Geboes, RHI, NHI, histologic improvement (GS<3.1), 2A histologic remission (GS<2A.0), and 2B histologic remission (GS<2B.0). AIM-HI UC was superior to pathologists for several Geboes subgrades (GS 0, GS 1, GS 2B, and GS 5), as well as grade-level Geboes, RHI, and positive percent agreement of 2A histologic remission. The model was shown to be greater than 99% repeatable for all histologic scoring metrics examined. 
Model-derived scores were shown to strongly correlate with canonical histologic features of inflammation, including the proportion of total epithelium that is inflamed (Spearman r=0.83; p<0.01), the proportion of neutrophils localized within crypt epithelium (Spearman r=0.83, p<0.01), and the amount of mucosal area classified as erosion or ulceration (Spearman r=0.80, p<0.01). Overall, these results suggest that AIM-HI UC has the potential to improve consistency of UC histology interpretation, providing a path toward standardization of UC histology scoring in clinical trials.
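
The non-inferiority and superiority thresholds quoted above can be illustrated with a minimal sketch. It assumes that "inclusive of confidence intervals" means the decision is taken on the lower confidence bound of the agreement difference; the abstract does not spell this out, and the class name, function name, and example numbers below are hypothetical.

```python
from dataclasses import dataclass

# Margins quoted in the abstract.
NON_INFERIORITY_MARGIN = -0.1
SUPERIORITY_MARGIN = 0.0


@dataclass
class AgreementDifference:
    """Algorithm-consensus agreement minus pathologist-consensus agreement."""
    estimate: float
    ci_lower: float
    ci_upper: float


def classify(diff: AgreementDifference) -> str:
    # Assumption: the whole confidence interval must clear the margin,
    # so only the lower bound needs to be checked.
    if diff.ci_lower > SUPERIORITY_MARGIN:
        return "superior"
    if diff.ci_lower > NON_INFERIORITY_MARGIN:
        return "non-inferior"
    return "not shown to be non-inferior"


# Hypothetical differences, not values from the study.
print(classify(AgreementDifference(0.05, 0.01, 0.09)))
print(classify(AgreementDifference(0.00, -0.04, 0.04)))
print(classify(AgreementDifference(-0.08, -0.15, -0.01)))
```

Under this reading, a metric can be non-inferior without being superior whenever its interval straddles zero but stays above -0.1.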

16
Artificial intelligence-assisted ganglion cell detection in Hirschsprung's disease: A comparative evaluation of two deep learning approaches

Wang, E.; Grenier, K.; Savadjiev, P.; Poenaru, D. D.

2026-06-12 pathology 10.64898/2026.06.11.26354826 medRxiv
Top 0.1%
14.3%

Background. Definitive diagnosis of Hirschsprung's disease (HD) requires pathological identification of enteric ganglion cells. This process is time-consuming and subject to inter-observer variability. Artificial intelligence (AI) tools have the potential to standardize and accelerate this workflow, but no study has determined which AI approach best serves intraoperative HD pathology diagnostics. Method. This study compared the U-Net and You Only Look Once version 26 (YOLO26) frameworks for ganglion cell detection using a single-centre retrospective dataset of 54 whole-slide images (WSIs) from rectal biopsies. WSIs were tiled into 397,731 image patches (128x128 pixels), further partitioned into training (70%), validation (15%), and testing (15%) sets. Models were evaluated on tile- and patient-level diagnostic metrics and processing latency. Results. The U-Net achieved a tile-level sensitivity of 82.9%, showing no statistically significant difference compared to YOLO26 (79.1%; p = 0.097). However, YOLO26 demonstrated a statistically significant advantage in tile-level specificity (96.1% vs. 93.9%; p < 0.001) and reduced mean inference latency (7.64 ms vs. 11.57 ms/tile). At the patient level, both models achieved 100% diagnostic sensitivity. Despite low patient-level specificity (0.0% U-Net; 11.8% YOLO26), the tissue-level diagnostic burden of false positives was 6.00% for U-Net and 3.50% for YOLO26. Conclusion. The U-Net is preferred when nominal gains in sensitivity are prioritized, while the YOLO26 is an alternative that optimizes efficiency and false positive suppression. Both models serve as robust screening filters to augment the pathologist's workflow and should be selected based on workflow requirements. Prospective validation on larger, multi-centre datasets is required before clinical implementation.
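
The gap between high tile-level specificity and low patient-level specificity can be illustrated with a short sketch. It assumes a patient is called positive for ganglion cells if any tile is flagged; the abstract does not state the aggregation rule, and the function names and tile counts below are hypothetical.

```python
def tile_specificity(flags: list[bool]) -> float:
    """Specificity over tiles that all truly lack ganglion cells.

    `flags[i]` is True when the model flagged tile i as containing a ganglion cell,
    so every True here is a false positive.
    """
    true_negatives = sum(1 for flagged in flags if not flagged)
    return true_negatives / len(flags)


def patient_called_positive(flags: list[bool]) -> bool:
    # Assumed aggregation rule: a single flagged tile makes the patient positive.
    return any(flags)


# Hypothetical aganglionic patient: 1,000 tiles, 39 of them falsely flagged.
flags = [True] * 39 + [False] * 961

print(f"tile-level specificity: {tile_specificity(flags):.3f}")
print(f"patient called positive: {patient_called_positive(flags)}")
```

With an any-tile rule, even a few false-positive tiles turn a truly negative patient into a patient-level false positive, which is one plausible reason to report the share of falsely flagged tissue alongside patient-level specificity.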

17
Cross-Modal Training Using Xenium Spatial Transcriptomics Enables DINO-DETR Based Detection of Vascular Niches in H&E Whole-Slide Images

S, P.; Alugam, R.; Gupta, S.; Shah, N.; Uppin, M. S.

2026-03-19 pathology 10.64898/2026.03.17.712266 medRxiv
Top 0.1%
12.4%

Background: Tumor vasculature is a key driver of glioma progression, yet routine quantification depends on subjective histopathologic assessment or resource-intensive ancillary immunohistochemistry. A scalable, objective method for vascular phenotyping from routine histology remains an unmet need. Methods: We leveraged 10x Genomics Xenium spatial transcriptomics data from a glioblastoma specimen to generate molecularly resolved annotations of GBM-associated endothelial cells and pericytes across 809,041 cells. These annotations were transferred to matched H&E-stained sections to train a DINO-DETR-based object detection model using a binary classification scheme (vascular vs. other). The model was validated on four independent Xenium patient slides and applied to a retrospective cohort of 119 diffuse gliomas spanning WHO grades 2-4 (oligodendroglioma, astrocytoma, and glioblastoma) with linked survival data. Results: Binary vascular cell detection achieved a precision of 0.78, a recall of 0.63, and an F1 score of 0.70, with an overall accuracy of 98.6%. Orthogonal spatial validation confirmed that predicted vascular cells were preferentially localized within annotated blood vessel regions. In subtype-stratified survival analysis, high AI-derived vascular cell proportion was significantly associated with worse overall survival in astrocytoma patients (log-rank p < 0.019). Conclusion: Cross-modal AI training using spatial transcriptomics enables scalable, molecularly informed vascular quantification directly from routine H&E slides. Within the astrocytoma subtype, where tumor grade is most heterogeneous and vascular phenotype most variable, objective vascular quantification provides independent prognostic information, demonstrating the potential of spatially supervised deep learning to extract clinically meaningful microenvironmental signals from universally available histologic material.
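
The reported F1 score follows from the reported precision and recall, since F1 is their harmonic mean. A minimal check is sketched below; the function name is illustrative and is not taken from the preprint.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
f1 = f1_score(0.78, 0.63)
print(f"{f1:.2f}")  # 0.70
```

The 98.6% overall accuracy is much higher than the F1 score, which is what one would expect if vascular cells are a small minority of all detected cells.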

18
Label-free 3D virtual histology of human formalin-fixed paraffin-embedded (FFPE) prostate needle biopsies with propagation-based phase-contrast micro-CT (PBCT)

Sugarman, A. L.; Vanselow, D. J.; Chen, G.; Clark, E.; Parkinson, D.; La Riviere, P.; Silverman, J.; Warrick, J.; Cheng, K. C. C.

2026-06-01 pathology 10.64898/2026.05.28.728215 medRxiv
Top 0.1%
12.3%

For over a century, the goal of estimating clinical outcome from tumor biopsies has been based on histomorphology of 2D tissue slices that represent a small fraction of collected samples. Its power derives from histology's 1) unbiased representation of cell types, 2) subcellular resolution that allows the characterization of health and disease states across cell types, and 3) multi-millimeter fields of view that allow assessment of tumor heterogeneity. Histology's dependence upon physical slices, however, limits assessment of 3-dimensional cellular volumes and tissue architecture. Here, we used propagation-based phase-contrast micro-CT (PBCT) to create 3D histological images of residual formalin-fixed, paraffin-embedded (FFPE) prostate needle biopsies. The resulting isotropic, grey-scale, 0.5 micron voxel matrices were used to explore the potential of 3D virtual histology to distinguish diagnostic categories including benign prostatic tissue and prostatic adenocarcinoma of Gleason patterns 3, 4, and 5. Maximum intensity projections of stacks of digital slices totaling 5 microns in thickness (virtual "slices") allowed the study of virtual sections corresponding to actual serial H&E-stained sections of tissue cut after micro-CT imaging. Like histology, our PBCT reconstructions allowed us to distinguish between non-infiltrative and undulating glands of benign prostatic tissue, infiltrative round glands of Gleason pattern 3, cribriform structures of Gleason pattern 4, and comedonecrosis of Gleason pattern 5. Unlike histology, micro-CT allowed us to further probe 3D tissue architecture in volumetric context. User-friendly exploration of sample volumes was achieved using a customized Neuroglancer multiplanar and 3D rendering interface. A sparsely trained CycleGAN produced plausible virtual H&E staining from the unstained micro-CT reconstructions.
Unlike tissue-section based histology, micro-CT-based virtual histology yields nondestructive 3D characterization of cancer cell and tissue architecture, including glandular spaces, without the undersampling or cutting artifacts of histology. These findings demonstrate the feasibility of PBCT-based 3D virtual histology of prostate cancer and suggest the exploration of derived quantitative analyses of tumor properties for potential contributions to patient care.
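
The 5 micron virtual sections described above can be sketched as maximum intensity projections over slabs of ten 0.5 micron slices. The sketch below uses plain Python lists so that it is self-contained; the function names, the non-overlapping slab layout, and the toy volume are assumptions for illustration, and a real pipeline would more likely use an array library.

```python
VOXEL_SIZE_UM = 0.5       # isotropic voxel size stated in the abstract
SLAB_THICKNESS_UM = 5.0   # virtual section thickness stated in the abstract
SLICES_PER_SLAB = round(SLAB_THICKNESS_UM / VOXEL_SIZE_UM)  # 10 slices


def max_intensity_projection(slab):
    """Collapse a stack of 2D slices into one 2D image, keeping the brightest voxel along z."""
    rows, cols = len(slab[0]), len(slab[0][0])
    return [
        [max(slice_[r][c] for slice_ in slab) for c in range(cols)]
        for r in range(rows)
    ]


def virtual_sections(volume):
    """Split a z-stack into consecutive, non-overlapping slabs and project each one."""
    last_start = len(volume) - SLICES_PER_SLAB
    return [
        max_intensity_projection(volume[z:z + SLICES_PER_SLAB])
        for z in range(0, last_start + 1, SLICES_PER_SLAB)
    ]


# Toy volume: 20 slices of 2x2 voxels, where each voxel value equals its slice index.
volume = [[[z, z], [z, z]] for z in range(20)]
sections = virtual_sections(volume)
print(len(sections), sections[0], sections[1])
```

Each projected image then stands in for one physical section of comparable thickness, which is what makes a side-by-side comparison with serial H&E sections possible.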

19
An Interactive Trustworthy AI Pathology Copilot to Improve Biomarker-Driven Prognostic Stratification and Therapeutic Response Prediction

Mao, Y.; Xie, C.; Li, F.; Li, D.; Zhang, W.; Zhang, Y.; Li, B.; Zhao, C.; Zhang, Z.; Tan, Y.; Cen, Z.; Tao, H.; Yang, J.; Wang, J.; Feng, Q.; Liu, B.; Liang, L.; Lu, C.; Zhang, Y.; Ning, Z.

2026-05-19 pathology 10.64898/2026.05.17.26352870 medRxiv
Top 0.1%
10.3%

Predictive assays for precision oncology increasingly rely on multi-scale biomarkers that manifest as morphologic signatures in routine whole-slide images (WSIs). However, most computational pathology models treat biomarker profiling and outcome prediction (i.e., prognostic stratification and therapeutic response) as independent tasks, and lack the interactive and trustworthy capabilities required for clinical translation. Here, we present TEAM, an interactive trustworthy AI pathology copilot that improves biomarker-driven outcome prediction. Pretrained on 55,648 pan-cancer WSIs and 1,750,648 regions of interest (ROIs), comprising 360 million patches, TEAM learns risk-aware embeddings by conditioning on clinical metadata and aligning with a relative risk prior. For trustworthy assessment, TEAM quantifies patch-level data (aleatoric) and model (epistemic) uncertainty, then propagates these estimates to patient-level predictions. In outcome prediction, profiled biomarkers serve as intermediate features to contextualize prognostic and therapeutic estimates. Beyond passive prediction, TEAM integrates vision-language models with agentic orchestration for clinical reasoning, and provides a web-based clinician-in-the-loop interface for interactive prediction refinement. Evaluated across 48 multi-institutional cohorts encompassing 85 benchmarks, TEAM consistently outperforms existing methods across biomarker profiling, prognostic stratification, and therapeutic response prediction, supporting trustworthy AI-assisted decision-making in computational pathology.

20
Development and Validation of a Multimodal AI-Based Model for Predicting Post-Prostatectomy Treatment Outcomes from Baseline Biparametric Prostate MRI

Simon, B. D.; Akcicek, E.; Harmon, S. A.; Clifton, L. D.; Thakur, A.; Gurram, S.; Clifton, D.; Wood, B. J.; Karaosmanoglu, A. D.; Choyke, P. L.; Akata, D.; Pinto, P. A.; Turkbey, B.

2026-03-22 urology 10.64898/2026.03.19.26348716 medRxiv
Top 0.1%
8.8%

Prostate cancer (PCa) is the second most common cancer and cause of cancer death in American men. Existing risk prediction methods have limited accuracy and reproducibility, resulting in difficulty in predicting disease severity. We demonstrate the development and external validation of an automated multimodal artificial intelligence algorithm using biparametric MRI (bpMRI) and clinical covariates for predicting biochemical recurrence (BCR) after radical prostatectomy (RP) in PCa patients. Development cohort included 80% of patients from center 1 (n = 240) who underwent prostate MRI prior to RP between January 2008 and December 2018 with a minimum of two years of follow-up after RP. Test cohort included the remaining 20% of center 1 patients (n = 71), and the external validation cohort from center 2 (n = 168). Center 2 patients included those who underwent prostate MRI and RP between January 2015 and December 2024 with a minimum of two years of follow-up. Clinical comparisons were CAPRA-S (center 1) and ISUP grade group from post-RP biopsy (center 2). Models developed were a clinical model (M0), an automated clinical model (M1), a radiomics model (M2), and a multimodal model (M3). Clinical variables (M0) included PSA, age, primary Gleason, and ISUP grade group. Automated clinical variables (M1 and M3) included PSA and age. Radiomics features (M2 and M3) were extracted from bpMRI using a lesion detection algorithm. Accuracy, sensitivity, specificity, and AUC were calculated, and log-rank tests compared BCR-free survival to assess the models' ability to discriminate relative to clinical standards. Intermediate-risk groups were also assessed. The multimodal model (M3) had the highest AUC across test sets (combined: 0.71; center 1: 0.70; center 2: 0.75) and was the only model to significantly differentiate BCR-free survival outcomes in intermediate-risk groups across both centers (p < 0.05).
This automated multimodal model leveraging radiomics and clinical covariates can predict BCR after RP, approaches clinical gold standards, and may enhance imaging-based prognostication following further validation.